Segmentation and Recognition of Handwritten Kannada Text Using Relevance Feedback and Histogram of Oriented Gradients – A Novel Approach

Author

  • Karthik S
Abstract

India is a multilingual country with 22 official languages and more than 1600 languages in existence. Kannada, one of the official languages, is a south Indian language widely used in the state of Karnataka, whose population is over 65 million, and it ranks 33rd among the most widely spoken languages in the world. However, the survey reveals that much more effort is required to develop a complete Optical Character Recognition (OCR) system. The present research work aims to develop a suitable methodology toward that goal. The overall accuracy of an OCR system depends largely on the accuracy of the segmentation phase, so a robust and efficient segmentation method is desirable. This paper proposes a method for segmenting the text properly in order to improve OCR performance at the later stages. In the proposed method, segmentation is done using the horizontal projection profile and windowing. The result is passed to the recognition module, which uses the Histogram of Oriented Gradients (HoG) in combination with a support vector machine (SVM). The recognition result is then fed back to the segmentation module to improve accuracy. The experiments delivered promising results.

Keywords: Optical character recognition; Histogram of oriented gradients; relevance feedback; segmentation; Support Vector Machine; handwritten Kannada documents
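As a rough illustration of the horizontal projection profile step mentioned in the abstract, the sketch below counts ink pixels per row of a binary image and splits the page into text lines wherever the count drops below a threshold. It is not the paper's implementation: the function names, the `min_ink` parameter, and the toy image are assumptions made for this example, and the windowing and relevance feedback stages are not shown.

```python
from typing import List, Tuple


def horizontal_projection(image: List[List[int]]) -> List[int]:
    """Count ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]


def segment_lines(image: List[List[int]], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (start_row, end_row_exclusive) for each run of rows
    whose ink count is at least min_ink (a hypothetical threshold)."""
    profile = horizontal_projection(image)
    lines: List[Tuple[int, int]] = []
    start = None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                      # a text line begins
        elif ink < min_ink and start is not None:
            lines.append((start, y))       # a text line ends at a gap row
            start = None
    if start is not None:                  # line runs to the bottom edge
        lines.append((start, len(profile)))
    return lines


# Toy 6x5 binary "page": two text lines separated by an empty row.
page = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
print(segment_lines(page))
```

In a feedback loop of the kind the abstract describes, a threshold such as `min_ink` is one parameter that could plausibly be adjusted when the recognizer reports low confidence, though the abstract does not say which parameters the authors adjust.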


Similar resources

Kannada Handwritten Character Recognition for Automatic Form Processing

Data processing and management are common nowadays. In this paper, automatic processing of forms written in the Kannada language is considered. A suitable pre-processing technique is presented for extracting handwritten characters. Principal Component Analysis (PCA) and Histogram of Oriented Gradients (HoG) are used for feature extraction. These features are fed to multilayer feed forward back pro...


Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model

To facilitate the entry of data into the computer and its digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...


Recognition of Handwritten Digits using Histogram of Oriented Gradients

Off-line recognition of text plays a significant role in several applications, such as cheque verification and mail sorting. However, selecting the feature extraction technique remains a challenging step in achieving high recognition accuracy. This paper presents an efficient handwritten digit recognition system based on HOG to capture the discriminative features of digit image....


Implicit segmentation of Kannada characters in offline handwriting recognition using hidden Markov models

We describe a method for classification of handwritten Kannada characters using Hidden Markov Models (HMMs). Kannada script is agglutinative, where simple shapes are concatenated horizontally to form a character. This results in a large number of characters making the task of classification difficult. Character segmentation plays a significant role in reducing the number of classes. Explicit se...


OCR for printed Kannada text to Machine editable format using Database approach

This paper describes an Optical Character Recognition (OCR) system for printed text documents in Kannada, a South Indian language. The proposed OCR system recognizes printed Kannada text and can handle all types of Kannada characters. The system first extracts the image of the Kannada script, then performs line segmentation on the image, then segments the words into sub-character level piece...




Publication date: 2016